
    Null models and complexity science: disentangling signal from noise in complex interacting systems

    The constantly increasing availability of fine-grained data has led to very detailed descriptions of many socio-economic systems (such as financial markets, interbank loans or supply chains), whose representations, however, quickly become too complex to allow for any meaningful intuition or insight about their functioning mechanisms. This, in turn, leads to the challenge of disentangling statistically meaningful information from noise without assuming any a priori knowledge of the particular system under study. The aim of this thesis is to develop unsupervised techniques to extract relevant information from large complex interacting systems, and to test them on real-world data. The question I try to answer is the following: is it possible to disentangle statistically relevant information from noise without assuming any prior knowledge about the system under study? In particular, I tackle this challenge from the viewpoint of hypothesis testing by developing techniques based on so-called null models, i.e., partially randomised representations of the system under study. Given that complex systems can be analysed both from the perspective of their time evolution and of their time-aggregated properties, I have developed and tested one technique for each of these two purposes. The first technique is aimed at extracting “backbones” of relevant relationships in complex interacting systems represented as static weighted networks of pairwise interactions, and it is inspired by the well-known Pólya urn combinatorial process. The second technique is instead aimed at identifying statistically relevant events and temporal patterns in single or multiple time series by means of maximum entropy null models based on Ensemble Theory. Both of these methodologies try to exploit the heterogeneity of complex systems data in order to design null models that are tailored to the systems under study, and therefore capable of identifying signals that are genuinely distinctive of the systems themselves.

    Maximum entropy approach to multivariate time series randomization

    Natural and social multivariate systems are commonly studied through sets of simultaneous and time-spaced measurements of the observables that drive their dynamics, i.e., through sets of time series. Typically, this is done via hypothesis testing: the statistical properties of the empirical time series are tested against those expected under a suitable null hypothesis. This is a very challenging task in complex interacting systems, where statistical stability is often poor due to lack of stationarity and ergodicity. Here, we describe an unsupervised, data-driven framework to perform hypothesis testing in such situations. This consists of a statistical mechanical approach—analogous to the configuration model for networked systems—for ensembles of time series designed to preserve, on average, some of the statistical properties observed on an empirical set of time series. We showcase its possible applications with a case study on financial portfolio selection.
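    A minimal Python sketch of the maximum-entropy surrogate idea described above. It imposes only the simplest possible constraints (each series' mean and variance, preserved on average), for which the maximum-entropy ensemble is an independent Gaussian per variable; the actual framework constrains richer statistics, and the function names, placeholder data and test statistic below are illustrative assumptions.

        # Maximum-entropy surrogates preserving each series' mean and variance
        # on average (the max-ent ensemble for these constraints is Gaussian).
        # Illustrative sketch only, not the paper's full framework.
        import numpy as np

        def max_entropy_surrogates(X, n_surrogates=1000, seed=0):
            """X: (T, N) array of N empirical time series of length T.
            Returns (n_surrogates, T, N) samples from the constrained ensemble."""
            rng = np.random.default_rng(seed)
            mu = X.mean(axis=0)            # per-series empirical means
            sigma = X.std(axis=0, ddof=1)  # per-series empirical standard deviations
            T, N = X.shape
            return rng.normal(mu, sigma, size=(n_surrogates, T, N))

        def largest_corr_eigenvalue(x):
            """Example test statistic: top eigenvalue of the correlation matrix."""
            return np.linalg.eigvalsh(np.corrcoef(x, rowvar=False))[-1]

        # Hypothesis test: p-value of the empirical statistic under the null ensemble.
        X = np.random.randn(500, 10).cumsum(axis=0)  # placeholder data
        null = np.array([largest_corr_eigenvalue(s) for s in max_entropy_surrogates(X)])
        p_value = (null >= largest_corr_eigenvalue(X)).mean()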

    Correspondence between temporal correlations in time series, inverse problems, and the Spherical Model

    In this paper we employ methods from Statistical Mechanics to model temporal correlations in time series. We put forward a methodology based on the Maximum Entropy principle to generate ensembles of time series constrained to preserve part of the temporal structure of an empirical time series of interest. We show that a constraint on the lag-one autocorrelation can be fully handled analytically, and corresponds to the well-known Spherical Model of a ferromagnet. We then extend such a model to include constraints on more complex temporal correlations by means of perturbation theory, showing that this leads to substantial improvements in capturing the lag-one autocorrelation in the variance. We apply our approach to synthetic data, and illustrate how it can be used to formulate expectations on the future values of a data generating process.
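    A schematic LaTeX form of the kind of constrained ensemble described above: a Lagrange multiplier conjugate to the lag-one autocorrelation, together with an overall variance constraint that yields the spherical condition linking the ensemble to the Spherical Model. The exact normalisation and constraint conventions are assumptions made here for illustration.

        % Maximum-entropy ensemble with a lag-one autocorrelation constraint and a
        % spherical (overall variance) constraint; \theta is the Lagrange multiplier
        % conjugate to the lag-one autocorrelation of the T-point series x.
        P(x_1,\dots,x_T) \;\propto\;
            \exp\!\Big(\theta \sum_{t=1}^{T-1} x_t\, x_{t+1}\Big)\,
            \delta\!\Big(\sum_{t=1}^{T} x_t^{2} - T\Big)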

    A Pólya urn approach to information filtering in complex networks

    The increasing availability of data demands techniques to filter information in large complex networks of interactions. A number of approaches have been proposed to extract network backbones by assessing the statistical significance of links against null hypotheses of random interaction. Yet, it is well known that the growth of most real-world networks is non-random, as past interactions between nodes typically increase the likelihood of further interaction. Here, we propose a filtering methodology inspired by the Pólya urn, a combinatorial model driven by a self-reinforcement mechanism, which relies on a family of null hypotheses that can be calibrated to assess which links are statistically significant with respect to a given network’s own heterogeneity. We provide a full characterization of the filter, and show that it selects links based on a non-trivial interplay between their local importance and the importance of the nodes they belong to.
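    A minimal Python sketch of a Pólya-urn link filter of the kind described above. The Beta-Binomial parameterization of the null (shape parameters 1/a and (k-1)/a for a node of degree k and integer strength s), and the rule of keeping a link that is significant from at least one of its endpoints, are assumptions made here for illustration rather than the paper's exact prescription.

        # Illustrative Pólya-urn link filter: each link's weight is tested against a
        # Beta-Binomial null parameterized by its endpoint's degree and strength.
        # Integer weights/strengths and the chosen parameterization are assumptions.
        from scipy.stats import betabinom

        def polya_link_pvalue(w, s, k, a):
            """P(weight >= w) for one link of a node with integer strength s and
            degree k, under a Pólya urn null with reinforcement parameter a > 0."""
            if k <= 1:
                return 1.0  # a single-link node carries no filtering information
            return betabinom(int(s), 1.0 / a, (k - 1) / a).sf(w - 1)

        def polya_filter(edges, strength, degree, a=1.0, alpha=0.01):
            """edges: iterable of (u, v, w) tuples. Keeps a link if it is
            significant from the viewpoint of at least one of its endpoints."""
            kept = []
            for u, v, w in edges:
                p = min(polya_link_pvalue(w, strength[u], degree[u], a),
                        polya_link_pvalue(w, strength[v], degree[v], a))
                if p < alpha:
                    kept.append((u, v, w, p))
            return kept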

    Deep Reinforcement Learning for Active High Frequency Trading

    We introduce the first end-to-end Deep Reinforcement Learning (DRL) based framework for active high-frequency trading. We train DRL agents to trade one unit of Intel Corporation stock by employing the Proximal Policy Optimization algorithm. The training is performed on three contiguous months of high-frequency Limit Order Book data, of which the last month constitutes the validation data. In order to maximise the signal-to-noise ratio in the training data, we compose the latter by selecting only the training samples with the largest price changes. The test is then carried out on the following month of data. Hyperparameters are tuned using the Sequential Model Based Optimization technique. We consider three different state characterizations, which differ in their LOB-based meta-features. Analysing the agents' performances on test data, we argue that the agents are able to create a dynamic representation of the underlying environment. They identify occasional regularities present in the data and exploit them to create long-term profitable trading strategies. Indeed, agents learn trading strategies able to produce stable positive returns in spite of the highly stochastic and non-stationary environment.
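    A toy, self-contained Python sketch of the kind of setup described above: a discrete buy/hold/sell agent trained with Proximal Policy Optimization (here via the stable-baselines3 library) on LOB-style features. The environment below feeds synthetic placeholder features and prices; its state definition, reward and data handling are assumptions made for illustration, not the paper's actual pipeline.

        # Toy LOB trading environment plus PPO training loop (illustrative only).
        import numpy as np
        import gymnasium as gym
        from gymnasium import spaces
        from stable_baselines3 import PPO

        class ToyLOBEnv(gym.Env):
            """One-unit trading environment driven by synthetic LOB-like features."""
            def __init__(self, n_features=20, horizon=500):
                super().__init__()
                self.observation_space = spaces.Box(-np.inf, np.inf,
                                                    shape=(n_features + 1,), dtype=np.float32)
                self.action_space = spaces.Discrete(3)  # 0 = sell, 1 = hold, 2 = buy
                self.n_features, self.horizon = n_features, horizon

            def reset(self, *, seed=None, options=None):
                super().reset(seed=seed)
                self.t, self.position = 0, 0  # position held: -1, 0 or +1 units
                self.prices = 100 + np.cumsum(self.np_random.normal(0, 0.01, self.horizon + 1))
                return self._obs(), {}

            def _obs(self):
                features = self.np_random.normal(size=self.n_features)  # placeholder LOB features
                return np.append(features, self.position).astype(np.float32)

            def step(self, action):
                target = action - 1                              # map {0,1,2} -> {-1,0,+1}
                price_change = self.prices[self.t + 1] - self.prices[self.t]
                reward = target * price_change                   # P&L of holding the target position
                self.position = target
                self.t += 1
                return self._obs(), reward, self.t >= self.horizon, False, {}

        model = PPO("MlpPolicy", ToyLOBEnv(), verbose=0)
        model.learn(total_timesteps=10_000)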

    Exogenous and Endogenous Price Jumps Belong to Different Dynamical Classes

    Synchronising a database of stock-specific news with 5 years' worth of order book data on 300 stocks, we show that abnormal price movements following news releases (exogenous) exhibit markedly different dynamical features from those arising spontaneously (endogenous). On average, large volatility fluctuations induced by exogenous events occur abruptly and are followed by a decaying power-law relaxation, while endogenous price jumps are characterized by a progressively accelerating growth of volatility, also followed by a power-law relaxation, but a slower one than for exogenous jumps. Remarkably, our results are reminiscent of what is observed in different contexts, namely Amazon book sales and YouTube views. Finally, we show that fitting power laws to individual volatility profiles allows one to classify large events into endogenous and exogenous dynamical classes, without relying on the news feed.
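    An illustrative Python sketch of the classification idea described above: fit power laws to the volatility profile before and after a detected jump, and use the pre-jump growth exponent to separate endogenous from exogenous events. The functional forms, window length and decision threshold are assumptions made here, not the paper's calibrated procedure.

        # Classify a price jump by fitting power laws to its volatility profile.
        # Window lengths, initial guesses and threshold are illustrative choices.
        import numpy as np
        from scipy.optimize import curve_fit

        def power_law(lag, A, beta):
            return A * np.power(lag, -beta)

        def fit_relaxation(volatility, jump_idx, window=200):
            """Post-jump decay exponent: sigma(lag) ~ A * lag^(-beta), lag counted forward."""
            lags = np.arange(1, window + 1)
            post = volatility[jump_idx + 1 : jump_idx + 1 + window]
            (A, beta), _ = curve_fit(power_law, lags, post, p0=(post[0], 0.5), maxfev=10_000)
            return beta

        def fit_growth(volatility, jump_idx, window=200):
            """Pre-jump growth exponent: sigma(lag) ~ A * lag^(-beta), lag counted backward."""
            lags = np.arange(1, window + 1)
            pre = volatility[jump_idx - window : jump_idx][::-1]  # lag 1 = point just before the jump
            (A, beta), _ = curve_fit(power_law, lags, pre, p0=(pre[0], 0.5), maxfev=10_000)
            return beta

        def classify_jump(volatility, jump_idx, growth_threshold=0.1):
            """Label a jump endogenous if volatility builds up appreciably before it."""
            return "endogenous" if fit_growth(volatility, jump_idx) > growth_threshold else "exogenous"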